Testing 10 deer samples to compare the BLAST algorithms with one another (and to check the accuracy of BLAST itself) on ambiguous ASVs that occur in significant quantities.

[Plot: BLASTn (without manually altering)]

[Plot: Discontiguous Megablast]

[Plot: Megablast]

[Plot: IDTAXA Th30]

[Plot: IDTAXA Th60]

[Plot: ASSIGNTAXONOMY]

Introduction

BLASTn


BLASTn: slow, but allows a word size as small as seven bases.

The hover text includes a line called SpeciesASV, which gives the classification that IDTAXA (Th30) assigned to the ASV.

Notice the lighter green (Haemonchus contortus). For this plot, I have left all classifications exactly as BLAST output them, to show how drastic the difference can look. For the next two plots, I have manually “fixed” the classification of Haemonchus contortus, based on a manual review of the ASV in Geneious.
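The comparison between the BLAST label and the SpeciesASV line in the hover text can also be tabulated. The following is a minimal sketch, assuming each method's output has been reduced to a mapping from ASV ID to species name. The function name `find_disagreements`, the ASV IDs, and the species names are hypothetical placeholders, not results from these samples.

```python
def find_disagreements(blast_calls, idtaxa_calls):
    """Return {asv: (blast_species, idtaxa_species)} where the two labels differ."""
    disagreements = {}
    # Only ASVs classified by both methods can be compared.
    for asv in sorted(blast_calls.keys() & idtaxa_calls.keys()):
        blast_species = blast_calls[asv]
        idtaxa_species = idtaxa_calls[asv]
        if blast_species != idtaxa_species:
            disagreements[asv] = (blast_species, idtaxa_species)
    return disagreements


# Hypothetical placeholder data, not taken from the deer samples.
blast_calls = {"ASV_A": "Species X", "ASV_B": "Species Y", "ASV_C": "Species Z"}
idtaxa_calls = {"ASV_A": "Species X", "ASV_B": "Species W", "ASV_C": "Species Z"}

disagreements = find_disagreements(blast_calls, idtaxa_calls)
for asv, (blast_species, idtaxa_species) in disagreements.items():
    print(f"{asv}: BLAST={blast_species} | IDTAXA={idtaxa_species}")
```

ASVs that appear in this list are the ones worth checking by hand, as was done here for Haemonchus contortus.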

DISCONTIGUOUS MEGABLAST


DISCONTIGUOUS MEGABLAST: uses an initial seed that ignores some bases (allowing mismatches); intended for cross-species comparisons.

Notice that here “Haemonchus contortus” is now labelled “Haemonchus contortus but actually placei”. The remaining Haemonchus contortus assignments (ASV 7 and ASV 21) were examined and found to be classified as such because of poor query coverage (16%), which shows up in the pipeline output as a low bitscore.
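Query coverage is the share of the query sequence that the alignment actually spans, so a hit can have a high percent identity and still rest on a small fragment of the ASV. The sketch below shows the arithmetic for a single alignment and flags low-coverage hits. The function names, the 80% cutoff, and the example coordinates are assumptions for illustration only; the coordinates were chosen to reproduce a 16% coverage value and are not the real alignments of ASV 7 or ASV 21.

```python
def query_coverage(qstart, qend, qlen):
    """Percent of the query spanned by one alignment (1-based, inclusive coordinates)."""
    aligned = abs(qend - qstart) + 1
    return 100.0 * aligned / qlen


def low_coverage_hits(hits, min_coverage=80.0):
    """Return the hits whose query coverage falls below min_coverage (assumed cutoff)."""
    return [
        hit for hit in hits
        if query_coverage(hit["qstart"], hit["qend"], hit["qlen"]) < min_coverage
    ]


# Hypothetical hits: a 250-base query aligned over 40 bases gives 16% coverage.
hits = [
    {"asv": "ASV_low", "qstart": 1, "qend": 40, "qlen": 250},
    {"asv": "ASV_full", "qstart": 1, "qend": 250, "qlen": 250},
]

flagged = low_coverage_hits(hits)
for hit in flagged:
    coverage = query_coverage(hit["qstart"], hit["qend"], hit["qlen"])
    print(f'{hit["asv"]}: query coverage {coverage:.0f}%')
```

Because the bitscore grows with the length of the aligned region, a short alignment like the first one also tends to produce a low bitscore. Note that this sketch handles one alignment per hit; BLAST's own coverage columns can combine several alignments to the same subject.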

MEGABLAST


MEGABLAST: for comparing a query to closely related sequences; works best if the target percent identity is >95%.
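With the command-line BLAST+ suite, all three algorithms are run through the same `blastn` program and selected with its `-task` option. The sketch below only builds the three command lines so they can be inspected side by side. The file names `asvs.fasta` and `reference_db`, the output file names, and the chosen output columns are assumptions, not the settings used for these plots.

```python
import shlex

# -task values of the BLAST+ blastn program for the three algorithms above.
TASKS = {
    "BLASTn": "blastn",
    "Discontiguous Megablast": "dc-megablast",
    "Megablast": "megablast",
}

# Tabular output with identity, query coverage, e-value and bitscore columns.
OUTFMT = "6 qseqid sseqid pident length qcovs evalue bitscore"


def build_blast_command(task, query_fasta, database, out_path):
    """Build the argument list for one blastn run (nothing is executed here)."""
    return [
        "blastn",
        "-task", task,
        "-query", query_fasta,
        "-db", database,
        "-outfmt", OUTFMT,
        "-out", out_path,
    ]


# Hypothetical input and output names.
commands = {
    label: build_blast_command(task, "asvs.fasta", "reference_db", f"{task}.tsv")
    for label, task in TASKS.items()
}

for label, command in commands.items():
    print(f"{label}: {shlex.join(command)}")
```

Each list can be passed to `subprocess.run(command, check=True)` once BLAST+ is installed and the database exists. Keeping the query, database, and output format identical across the three runs means any difference in classification comes from the algorithm alone.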

Comparing Three BLAST Algorithms

[Plots, two columns with the same three panels in each: BLASTn (without manually altering), Discontiguous Megablast, Megablast]

Comparing BLAST to Pipeline Classification

[Plots, first column: BLASTn (without manually altering), Discontiguous Megablast, Megablast]

[Plots, second column: IDTAXA Th30, IDTAXA Th60, ASSIGNTAXONOMY]

Comparing IDTAXA and ASSIGNTAXONOMY

[Plots, two columns with the same three panels in each: IDTAXA Th30, IDTAXA Th60, ASSIGNTAXONOMY]